Cytometry Part A
○ Wiley
Preprints posted in the last 30 days, ranked by how well they match Cytometry Part A's content profile, based on 30 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.
Chen, Y.-L.; Zhang, C.; Lucas, F.; Hadlock, J.; Foy, B. H.
Introduction The complete blood count with differential (CBD) is one of the most commonly performed blood tests worldwide, used in nearly all areas of medicine. Although modern CBD analyzers generate flow cytometry-based single-cell measurements, the resultant CBD markers are limited to coarse summary features, such as total cell counts and average cell sizes. As a result, the markers cannot detect subtle cell population shifts that may signal early-stage pathogenesis. We therefore evaluate whether AI-based analysis of the raw single-cell data underlying the CBD can be used to develop novel, clinically prognostic biomarkers across patient settings. Method We developed two complementary methods for biomarker discovery using CBD tests and evaluated them with longitudinal data from an academic medical center. To create interpretable biomarkers, we clustered cells into physiologically meaningful subpopulations and performed robust statistical summarization. In tandem, self-supervised autoencoders were developed to extract novel nonlinear markers. We evaluated the utility of these clustering (CLS) and autoencoder (AE) markers for patient prognostication across a range of outcomes (mortality, inpatient admission, and future disease development). Results Our study included 242,623 CBD samples from 127,545 patients. Both clustering and embedding approaches successfully generated hundreds of new clinical biomarkers. Many biomarkers showed strong prognostic associations for all-cause mortality, inpatient admission, and development of anemia, cancer, or cardiovascular disease, with associations remaining significant after adjustment for demographics and clinical CBD markers. A large subset of these prognostic markers also showed high novelty, having low correlations with existing CBD markers, while also exhibiting significant correlations with broader physiologic signals, such as inflammatory, hormonal, infectious, and coagulopathic markers.
Conclusion Collectively, these results demonstrate how modern AI techniques can allow for deeper phenotyping of routine clinical blood counts, generating novel biomarkers that capture more subtle physiologic signals than those currently used clinically.
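As a rough illustration of the clustering (CLS) route described in this abstract, the sketch below summarizes one single-cell measurement channel per cluster with the median and the median absolute deviation (MAD). The cluster labels, channel values, and function name are hypothetical, and the abstract does not say which robust statistics the authors used, so this is an assumption rather than their method.

```python
from collections import defaultdict
from statistics import median

def robust_cluster_summary(values, labels):
    """Return {label: (cell count, median, MAD)} for one measurement channel."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    summary = {}
    for label, vals in groups.items():
        med = median(vals)
        # MAD: median of absolute deviations from the cluster median
        mad = median([abs(v - med) for v in vals])
        summary[label] = (len(vals), med, mad)
    return summary

# Hypothetical scatter intensities with made-up cluster assignments
values = [10.0, 12.0, 11.0, 50.0, 30.0, 32.0, 31.0]
labels = ["lymph", "lymph", "lymph", "lymph", "neut", "neut", "neut"]
summary = robust_cluster_summary(values, labels)
```

In this toy input, the single outlying value (50.0) barely moves the median or MAD of its cluster, which is the property that makes such summaries attractive for noisy single-cell data.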
Powell, S.; Bui, T.; Gullipalli, D.; LaCava, M.; Jones, S. M.; Hansen, T.; Kuhr, F.; Swat, W.; Simandi, Z.
Current clinical management of multiple myeloma (MM) relies on bone marrow (BM) biopsies for minimal residual disease (MRD) assessment. While BM biopsies are the gold standard, their invasive nature and potential to miss extramedullary or patchy disease necessitate sensitive, non-invasive liquid biopsy platforms. In this study, we evaluated the analytical performance of the CellSearch CMMC assay to determine its utility for deep-MRD monitoring. Using a standard 4 mL whole blood input, the assay achieves a WBC-normalized sensitivity of 2.45 × 10⁻⁷, supported by a limit of quantitation of 5 cells per run. Given this high analytical sensitivity, the assay provides a robust negative predictive value, rendering false-negative findings highly unlikely in populations with detectable peripheral disease. These findings characterize the CellSearch CMMC assay as a highly sensitive, analytically validated platform for non-invasive longitudinal surveillance at deep-MRD levels. When integrated into a clinical workflow that accounts for its specificity profile, the platform offers a patient-friendly complement to serial BM biopsies, with the potential to reduce their frequency in appropriate clinical contexts.
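One way to read the reported sensitivity is as the limit of quantitation divided by the number of white blood cells in the sample. The sketch below works through that arithmetic. The 5 cells and 4 mL come from the abstract; the WBC concentration of 5.1 × 10⁶ per mL is an assumed value, chosen only to show that the reported figures are mutually consistent, and is not stated by the authors.

```python
def wbc_normalized_sensitivity(loq_cells, blood_volume_ml, wbc_per_ml):
    """Detectable target cells per white blood cell in the analyzed sample."""
    total_wbc = blood_volume_ml * wbc_per_ml
    return loq_cells / total_wbc

# 5 cells per run and 4 mL input are from the abstract;
# 5.1e6 WBC/mL is an ASSUMPTION used for illustration only.
sensitivity = wbc_normalized_sensitivity(loq_cells=5, blood_volume_ml=4.0, wbc_per_ml=5.1e6)
print(f"{sensitivity:.2e}")
```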
Idowu, A. M.; Ropa, J.; Hurwitz, S. N.
Background Competitive transplantation is essential for defining intrinsic repopulating capacity of murine hematopoietic stem and progenitor cells (HSPCs), yet comparable assays for human cells have been limited by the lack of a robust in vivo platform. Methods Here, we describe a novel competitive transplantation method in humanized NOD.Cg-KitW-41J Tyr + Prkdcscid Il2rgtm1Wjl/ThomJ (NBSGW) mice that enables simultaneous engraftment and longitudinal tracking of distinct human grafts within a shared microenvironment. Results Using human leukocyte antigen-mismatched donor CD34+ cells, this method allows standard flow cytometry panels to track chimerism from multiple donors, lineage output, and HSPC composition. The experimental framework may be adapted to different mouse models, conditioning strategies, donor sources, and treatments. Conclusions Overall, this humanized competitive repopulation assay fills a critical translational gap and offers a flexible foundation for advancing mechanistic discovery in human hematopoietic biology and improving clinical strategies for stem cell transplantation.
Merle, L.; Martin-Jaular, L.; Thery, C.; Joliot, A.
Extracellular vesicles (EVs) are key intercellular messengers that modulate the function of target cells by carrying effectors, either at their surface or in their lumen. In the latter case, their action depends on the ability to deliver their content into the cytosol of target cells. How efficiently EVs deliver their content upon interaction with their target cell is thus a central question for understanding the functional impact of this mode of action. To address this question, signal-driven bimolecular interactions between two partners located respectively in the EV lumen and the target cell cytosol have become a widely used strategy to detect the cytosolic delivery of EV content. However, detection of cytosolic delivery with these assays has often depended on artificially enhancing fusion between EV and cell membranes, for instance through expression of the fusogenic protein VSV-G. Here we provide a robust and quantitative LUCiferase-based complementation assay (HiBiT/LgBiT) to quantify the Internalization and cytosolic Delivery of EV content: LUCID-EV. By optimizing the signal-to-noise ratio of the assay, the method for loading the HiBiT fragment into EVs (fusion to a lipid-binding domain rather than to tetraspanins), and the intracellular position of LgBiT (associated with membranes), we could quantify cytosolic delivery from various non-VSV-G-expressing EVs into target immune dendritic cells. Importantly, this delivery did not involve the acidic late-endosome environment required for VSV-G-dependent EV cytosolic delivery. The limited efficacy of the process highlights the need for highly sensitive assays like the one described here. Further development of the LUCID-EV assay could help identify EV/target cell pairs with enhanced cytosolic delivery properties and characterize the cellular route of delivery.
Pore, M.; Balamurugan, K.; Atkinson, A.; Breen, D.; Mallory, P.; Cardamone, A.; McKennett, L.; Newkirk, C.; Sharan, S.; Bocik, W.; Sterneck, E.
Circulating tumor cells (CTCs), and especially CTC-clusters, are linked to poor prognosis and may reveal mechanisms of metastasis and treatment resistance. Therefore, developing unbiased methods for the functional characterization of CTCs in liquid biopsies is an urgent need. Here, we present an evaluation of multiplex imaging mass cytometry (IMC) to analyze CTCs in mice with human xenograft tumors. In a single-step process, IMC uses metal-labeled antibodies to simultaneously detect a large number of proteins/modifications within minimally manipulated small volumes of blood from the tail vein or heart. We used breast cancer cell lines and a patient-derived xenograft (PDX) to assess antibodies for cross-species interpretation. Along with manual verification, HALO-AI-based cell segmentation was used to identify CTCs and quantify markers. Despite some limitations regarding human-specificity, this technology can be used to investigate the effect of genetic and pharmacological interventions on the properties of single and cluster CTCs in tumor-bearing mice.
Letort, G.; Valon, L.; Michaut, A.; Cumming, T.; Xenard, L.; Phan, M.-S.; Dray, N.; Rueden, C. T.; Schweisguth, F.; Gros, J.; Bally-Cuif, L.; Tinevez, J.-Y.; Levayer, R.
Investigating single-cell dynamics and morphology in tissues and embryos requires highly accurate quantitative analysis of microscopy images. Despite significant advances in the field of bioimage analysis, even the most sophisticated segmentation and tracking algorithms inevitably produce errors (e.g., over-segmentation, missing objects, misconnected objects). Although error rates may be small, their propagation throughout a time-lapse sequence has catastrophic effects on the accuracy of tracking and extraction of single-cell parameters. Extracting single-cell temporal information in the context of a tissue or embryo thus requires expert curation to identify and correct segmentation errors. In the movies commonly used in developmental biology and stem cell research, both the number of imaged cells and the duration of recording are large, making this manual correction task extremely time-consuming. This has now become a major bottleneck in the fields of development, stem cell biology and bioimage analysis. We present here EpiCure (Epithelial Curation), a versatile tool designed to streamline and accelerate manual curation of segmentation and tracking in 2D movies of large epithelial tissues. EpiCure uses temporal information and morphometric parameters to automatically identify segmentation and tracking errors and provides user-friendly tools to correct them. It focuses on ergonomics and offers several visualization options to help navigate movies of tissues containing a large number of cells, speeding up the detection of errors and their curation. EpiCure is highly interoperable and supports input from a wide range of segmentation tools. It also includes multiple export filters, enabling seamless integration with downstream analysis pipelines.
In this paper, using movies from several animal models, we highlight the importance of curating cell segmentation and tracking for accurate downstream analysis, and demonstrate how EpiCure supports the curation needed to extract accurate single-cell dynamics and detect cellular events, making curation faster and feasible on large datasets.
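To make the idea of combining temporal information with morphometric parameters concrete, the sketch below flags frames where a tracked cell's area changes abruptly, a common symptom of a merge, split, or track swap. The function name, data layout, and threshold are hypothetical, and this is not EpiCure's actual error-detection logic, which the abstract does not detail.

```python
def flag_area_jumps(track_areas, max_ratio=1.5):
    """Flag (track_id, frame) pairs where cell area changes abruptly.

    track_areas maps a track id to its list of areas, one per frame.
    max_ratio is an illustrative threshold, not a published default.
    """
    suspects = []
    for track_id, areas in track_areas.items():
        for frame in range(1, len(areas)):
            prev, curr = areas[frame - 1], areas[frame]
            if prev <= 0 or curr <= 0:
                # A missing or empty object is always worth a look
                suspects.append((track_id, frame))
                continue
            ratio = max(prev, curr) / min(prev, curr)
            if ratio > max_ratio:
                suspects.append((track_id, frame))
    return sorted(suspects)

# Made-up areas (in pixels) for two tracks over four frames
tracks = {1: [100, 104, 98, 101], 2: [90, 92, 185, 95]}
suspects = flag_area_jumps(tracks)
```

Here track 2 roughly doubles in area at frame 2 and shrinks back at frame 3, so both transitions are flagged for manual review while track 1 passes untouched.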
Prasad, A.; Patel, S.; Ng, S.; Liu, C.; Gelb, B. D.
The lymphatic system is essential for maintaining fluid homeostasis, transporting lipids, and supporting immune function. Despite its central role in health and disease, progress in understanding human lymphatic vasculature has been constrained, in part because primary human lymphatic endothelial cells (LECs) are difficult to access and study in disease-relevant contexts. This study describes an efficient and scalable feeder-free method to differentiate human iPSCs into LECs that are transcriptionally and phenotypically similar to primary fetal LECs. An iPSC-derived LEC system overcomes a drawback of primary cells by enabling precise genetic perturbations, supporting study of lymphatic diseases of interest in a human context. By grounding our approach in in vivo stages of lymphangiogenesis, we describe a staged protocol that recapitulates the key milestones of lymphatic development. We first adapted a published method to differentiate human iPSCs into venous endothelial cells (VECs) and then initiated transdifferentiation of VECs into LECs. Using immunocytochemistry, qPCR, and flow cytometry, we demonstrated expression of lymphatic-specific markers in the differentiated population. We further characterized our induced VECs (iVECs) and LECs (iLECs) through bulk RNA sequencing analysis and compared the populations to pseudobulk VEC and LEC transcriptomic datasets generated from human fetal heart endothelia at 12, 13 and 14 weeks of gestation. Through this work, we expanded the repertoire of approaches for accessing LECs, with the goal of accelerating discoveries in lymphatic biology and therapeutics.
Canu, G.; Correra, R.; Plein, A. R.; Denti, L.; Fantin, A.; Ruhrberg, C.
Lymphatic vessels are formed during embryonic and postnatal development to facilitate interstitial fluid clearance and immune regulation after birth. Their organ-specific heterogeneity in organisation and function is preceded by heterogeneous origins of lymphatic endothelial cells (LECs), the main building blocks of lymphatic vessels. In the dermis, a subset of LECs was reported to arise from blood capillaries, which themselves differentiate, in part, from paraxial mesoderm. However, it is not known whether additional cell lineages contribute to the dermal LEC population. Here, we have combined transcriptomic analyses with genetic lineage tracing and wholemount immunostaining to show that 60% of LECs in the embryonic day (E) 13.5 and E15.5 dermis are derived from a cell lineage that expresses Csf1r, a marker of myeloid cells and their progeny. Csf1r lineage LECs persist in adult dermal lymphatic vasculature and are indispensable for normal lymphatic development, because Prox1 deletion within the Csf1r lineage causes dermal oedema and blood-filled lymphatic vessels. As Csf1r lineage dermal LECs do not themselves express Csf1r and also do not arise from Csf1r-expressing differentiated myeloid cells, our findings imply the existence of a Csf1r-expressing non-LEC precursor population for the majority of dermal LECs and will prompt further work to identify this cell population.
Helsens, C.; Pili, F.; Vasquez, E.; Aymanns, F.; Tinevez, J.-Y.; Ando, E.; Oates, A. C.
Long-term live imaging of growing samples with light-sheet fluorescence microscopy provides unique insights into development, but morphogenesis often displaces features of interest outside the microscope's field of view (FOV), calling for automated methods to track these features and update the microscope's FOV in real time. Existing solutions, which typically rely on local or global intensity distributions, struggle to follow specific features robustly throughout morphogenesis, leading to truncated or incomplete datasets. Here, we present a light-sheet live tracking tool (LiLiTTool) that maintains user-defined regions of interest (ROI) within the FOV throughout extended imaging sessions. LiLiTTool uses CoTracker3, a state-of-the-art deep learning-based motion predictor, augmented by sensor fusion with a trained object detector. This enables robust compensation for drift, rotation, and deformation during morphogenesis, while meeting the timing constraints of live acquisition. We validated LiLiTTool by integrating it with the Viventis LS1 microscope, achieving sub-second processing and stable tracking of growing zebrafish embryos over many hours. LiLiTTool supports multi-ROI tracking in 3D, enabling simultaneous monitoring of multiple features within the same embryo and in multiple embryos during a single acquisition. LiLiTTool is modular and openly available on GitHub and as a napari plugin for post-acquisition tracking. By enabling precise, adaptive, and scalable real-time imaging, LiLiTTool advances smart microscopy approaches and provides the developmental biology community with a practical tool for capturing reliable spatio-temporal information in growing embryos or other morphogenetic systems.
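The core feedback step, keeping a tracked ROI centred in the FOV, can be sketched as a conversion from a pixel offset to a stage move. Everything below is an assumption made for illustration: the function name, the (z, y, x) axis order, the deadband, and the sign convention are hypothetical and do not describe LiLiTTool's API or the Viventis LS1 control interface.

```python
def stage_correction(roi_centroid_px, fov_shape_px, pixel_size_um, deadband_um=2.0):
    """Return the (z, y, x) move in micrometres that re-centres the ROI.

    Assumes that shifting the FOV centre by the returned offset brings the
    ROI back to the middle of the volume. Offsets smaller than deadband_um
    are ignored to avoid chasing tracking noise.
    """
    move = []
    for centroid, size, scale in zip(roi_centroid_px, fov_shape_px, pixel_size_um):
        centre = (size - 1) / 2.0
        offset_um = (centroid - centre) * scale
        move.append(offset_um if abs(offset_um) > deadband_um else 0.0)
    return tuple(move)

# Hypothetical volume of 101 x 1024 x 1024 voxels with 2.0 / 0.5 / 0.5 um spacing
move = stage_correction(
    roi_centroid_px=(50.0, 600.0, 511.5),
    fov_shape_px=(101, 1024, 1024),
    pixel_size_um=(2.0, 0.5, 0.5),
)
```

In this example the ROI has drifted only along y, so the sketch requests a single-axis correction and leaves z and x unchanged.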
Tetard, M.; Lin, T.; Peterson, N. A.; Gullberg, R. C.; Le Guen, Y.; Doench, J. G.; Egan, E. S.
Terminal erythroid differentiation involves dramatic cellular remodeling that culminates in the expulsion of the nucleus, a process known as enucleation. While enucleation is conserved across mammals and is crucial for the generation of fully functional erythrocytes, the mechanisms governing this process have remained largely unknown, in part because the absence of genetic material in mature, enucleated red blood cells hinders genetic experimentation. Here, we performed a pooled, forward-genetic CRISPR-Cas9 screen in enucleated red blood cells derived from primary human hematopoietic stem cells to identify genes required for enucleation. We found that Chloride Intracellular Channel 3 (CLIC3) and Vesicle-associated membrane protein 8 (VAMP8) are both necessary for terminal erythroid differentiation, yet likely act through different mechanisms. Knockdown of CLIC3 led to a delay in erythroblast differentiation, culminating in impaired enucleation. We found that the knockdown cells had increased p53 and p21 and exhibited cell cycle alterations, suggesting CLIC3 plays a crucial role in coordinating cell cycle progression during erythropoiesis. In comparison, VAMP8-depleted cells initially appear to undergo accelerated differentiation but then display a specific defect in enucleation. Transcriptional analysis of the VAMP8-knockdown cells suggested dysregulation of pathways for vesicle trafficking and actin binding, and imaging of late-stage erythroblasts revealed impaired nuclear polarization and disorganized actin. This work provides a new approach for functional genomics in enucleated cells and reveals novel factors important for terminal erythroid differentiation and enucleation.
Key points: (1) A CROPseq-based CRISPR-Cas9 screen enables functional genomics in enucleated primary human red blood cells. (2) Chloride Intracellular Channel 3 (CLIC3) and Vesicle Associated Membrane Protein 8 (VAMP8) were identified as critical for terminal erythroid differentiation and enucleation, likely acting through two distinct mechanisms.
Briski, O.; Fagali Franchi, F.; Piga, E.; Franciosi, F.; Nag Bonumallu, S. K.; Baro Graf, C.; Lode, V.; Luciano, A. M.; Krapf, D.
In vitro fertilization (IVF) is key for genetic improvement programs in cattle. However, embryos produced through IVF have lower developmental competence than those produced under in vivo conditions. Conventional sperm preparation for IVF typically relies on heparin for sperm capacitation but fails to replicate the finely tuned molecular environment of the oviduct, resulting in compromised embryonic competence. Here, we evaluated the effect of HyperBull, a novel capacitation technology, on bovine IVF outcomes using unsorted cryopreserved semen. In a split-sample design, 528 cumulus-oocyte complexes were co-incubated with either control or HyperBull-capacitated spermatozoa from the same bull. While overall blastocyst rates were not significantly different between groups (34.21% HyperBull vs. 28.63% control, p=0.148), the proportion of hatched embryos was significantly higher in the HyperBull group (15.82% vs. 9.13%, p=0.016). These findings suggest that modulating capacitation signals prior to insemination enhances embryonic developmental competence, thereby improving readiness for implantation. HyperBull may thus represent a valuable tool to increase the efficiency of IVF programs.
Kang, M.; Cabral, A. T.; Sawant, M.; Thiam, H. R.
Quantifying chromatin-state dynamics in living cells remains challenging, in part because most methods require fixation or cell lysis. Here, we benchmark three simple live-cell image-derived metrics computed from routine DNA staining as fixation-free readouts of chromatin state: the coefficient of variation (CV), 1-Gini, and the Diffuse Signal Index (DSI), which is introduced here. Using HL60-derived neutrophils (dHL-60) undergoing NETosis as a model system with a pronounced compact-to-decompact chromatin transition, we show that all three metrics track progressive chromatin reorganization in live-cell trajectories, but differ markedly in sensitivity: DSI provides the strongest trajectory-level discrimination between NETing and non-NETing cells, followed by 1-Gini and CV. Comparison with Tn5-based chromatin accessibility measurements in fixed cells further shows that all three metrics correlate with chromatin accessibility, supporting their biological relevance. Together, our results provide a practical framework for extracting chromatin-state readouts from routine live-cell DNA staining and identify DSI as the most discriminative metric for tracking chromatin reorganization in this benchmark.
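Two of the three metrics have standard definitions and can be sketched directly from the pixel intensities inside a nuclear mask. The code below assumes a flat list of non-negative intensities and uses the population standard deviation for the CV; both choices are assumptions, since the abstract does not give the exact formulas. DSI is omitted because its definition is not stated in the abstract.

```python
from statistics import mean, pstdev

def coefficient_of_variation(pixels):
    """Population standard deviation divided by the mean intensity."""
    return pstdev(pixels) / mean(pixels)

def one_minus_gini(pixels):
    """1 - Gini coefficient of the intensity distribution.

    Uses the sorted-rank form of the Gini coefficient:
    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n.
    """
    xs = sorted(pixels)
    n = len(xs)
    total = sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    gini = 2.0 * weighted / (n * total) - (n + 1) / n
    return 1.0 - gini

# Toy intensities: signal concentrated in one pixel vs. spread evenly
concentrated = [0, 0, 0, 1]
even = [5, 5, 5, 5]
print(one_minus_gini(concentrated), one_minus_gini(even))
print(coefficient_of_variation(concentrated), coefficient_of_variation(even))
```

In this toy case, evenly spread signal gives a 1-Gini of 1 and a CV of 0, while signal concentrated in a single pixel gives a lower 1-Gini and a higher CV.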
Schobert, M.; Boehm, S.; Borisov, O.; Li, Y.; Greve, G.; Edemir, B.; Woodward, O. M.; Jung, H. J.; Koettgen, M. M.; Westermann, L.; Schlosser, P.; Hutter, F.; Kottgen, A.; Haug, S.
Background Kidney cell lines are widely used to model kidney physiology and disease; however, their gene expression profiles may differ from primary cells due to immortalization, culture conditions, or experimental treatments. Determining whether a cell line resembles its native cell type is critical for interpreting in vitro findings. We developed a transcriptome-based approach that matches bulk RNA-seq data from kidney cell lines, primary cells, or tissues to reference cell types derived from single-cell RNA-seq (scRNA-seq) datasets. Methods Reference transcriptomic profiles were generated from two human and two murine kidney scRNA-seq datasets by pseudobulk aggregation. Bulk RNA-seq data from microdissected kidney tissue, non-kidney negative controls, and kidney cell lines were matched to these references using three statistical similarity measures (Spearman correlation, Euclidean distance, Poisson distance) and three machine learning classifiers (Random Forest, XGBoost, TabPFN). Each was assessed with global gene expression, curated kidney marker gene lists, and the most variable genes. Matching accuracy was evaluated through a three-step validation strategy: within-dataset matching, cross-reference comparison, and validation against primary kidney tissue and negative controls. Results Gene expression rank-based Spearman correlation and TabPFN, a foundation model for tabular data, emerged as the most accurate and specific approaches, particularly with curated kidney marker gene lists. Both methods correctly identified microdissected kidney tubule segments and were robust against non-kidney negative controls. Applied to commonly used kidney cell lines, OK cells retained proximal tubule identity, particularly under shear stress, while other proximal tubule lines (HK-2, HKC-8, HKC-11) showed inconsistent matching. Collecting duct-derived mIMCD-3 maintained stable similarity across passages, culture conditions, and genetic modifications.
Conclusion We provide two complementary implementations: CellMatchR, an accessible web-based tool using Spearman correlation for routine use, and comprehensive scripts for TabPFN-based matching (link will be added after peer-reviewed publication). Together, these resources enable researchers to make informed decisions about kidney cell culture model selection, interpretation, and stability. Translational Statement Kidney cell lines are fundamental tools in nephrology research, yet their transcriptomic similarity to native cell types is rarely validated systematically. We demonstrate that combining bulk RNA-seq data with single-cell reference datasets enables robust assessment of cell line identity using gene expression-rank-based correlation and machine learning approaches. By providing a comprehensive evaluation of matching methods, curated kidney marker gene lists, and reference datasets, our study serves as both a practical resource and a methodological framework for the kidney research community, facilitating informed selection of cell culture models, quality control of experimental conditions, development of new experimental cell culture models, and more reliable translation of in vitro findings to kidney physiology and disease.
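The rank-based matching idea can be sketched in a few lines: rank the expression values of a shared marker gene list in the bulk sample and in each pseudobulk reference, then pick the reference with the highest Spearman correlation. The gene order, expression values, and function names below are made up for illustration and are not taken from CellMatchR.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def best_match(bulk, references):
    scores = {name: spearman(bulk, profile) for name, profile in references.items()}
    return max(scores, key=scores.get), scores

# Hypothetical values for markers in the order LRP2, SLC34A1, AQP2, UMOD
references = {
    "proximal_tubule": [90, 80, 1, 2],
    "collecting_duct": [2, 1, 95, 5],
}
bulk_sample = [40, 55, 3, 4]
name, scores = best_match(bulk_sample, references)
```

Because only ranks are compared, the match is insensitive to differences in overall scale between bulk and pseudobulk data, which is one plausible reason a rank-based measure suits this cross-platform comparison.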
Heaton, H.; Behboudi, R.; Ward, C.; Weerakoon, M.; Kanaan, S.; Reichle, S.; Hunter, N.; Furlan, S.
Rare, genetically distinct cells can occur in various samples, such as those from transplant patients, tissues with naturally occurring maternal-fetal microchimerism, and cancers with sufficient mutational burden. Computational methods for detecting these foreign cells are vital to studying these biological conditions. An application of particular interest is leukemia patients after hematopoietic cell transplant (HCT). In many leukemias, a primary therapy is HCT, after which the primary genotype of the bone marrow and blood cells should be of donor origin. Cells that carry the patient's genotype and belong to the cell type lineage of the particular leukemia constitute measurable residual disease (MRD). If the MRD is high enough, this may represent a relapse of the patient's leukemia. Accurately estimating the MRD is therefore important for driving clinical decision making for these patients. Here we present Cellector, a computational method for identifying rare foreign genotype cells in single cell RNAseq (scRNAseq) datasets. We show that Cellector accurately detects microchimeric cells even when they are present at an exceedingly low percentage (0.05% or lower).
Sasaki, K.; Satouh, Y.; Michizaki, M.; Jinno-Oue, A.; Matsuzaki, T.
Understanding the functions of maternal effect genes during oocyte growth is essential for elucidating the mechanisms of oogenesis and early embryonic development. However, conventional gene knockout and conditional knockout approaches require extensive breeding and are time-consuming. Here, we present a rapid in vitro gene functional analysis system that combines microinjection of mRNA, siRNA and plasmid DNA into mouse secondary follicles with a two-step oocyte growth culture system. Mouse secondary follicles were subjected to microinjection of mCherry mRNA and subsequently cultured for 15 days to produce fully grown oocytes. mCherry fluorescence persisted throughout the oocyte growth period but declined rapidly after fertilization. Despite minor cellular damage occasionally caused by microinjection, injected follicles developed normally and retained developmental competence. To evaluate the efficiency of gene suppression, we introduced siRNA targeting Dnmt3l, which is abundantly expressed during the oocyte growth phase. Consistent with reports that Dnmt3l deficiency does not affect oocyte growth, oocytes in our knockdown model grew normally despite a marked reduction in endogenous Dnmt3l mRNA levels. These results demonstrate that this method enables efficient manipulation of gene expression specifically during oocyte growth while preserving developmental competence, providing a versatile platform for rapid functional screening of maternal effect genes in vitro.
Kirshenboim, O.; Kabya, A.; Yehezkel-Imra, R.; Tshuva, Y.; Maiers, M.; Gragert, L.; Bashyal, P.; Israeli, S.; Louzoun, Y.
Background The success of hematopoietic stem cell transplantation (HSCT) depends critically on human leukocyte antigen (HLA) matching between donor and recipient. While traditional matching focuses on five classical HLA loci (A, B, C, DRB1, DQB1), clinical practice increasingly considers extended typing at nine loci, including DPA1, DQA1, DPB1, and DRB3/4/5. Furthermore, emerging evidence supports transplantation with up to three HLA mismatches under post-transplant cyclophosphamide (PTCy) regimens. However, current donor search algorithms cannot efficiently identify donors with multiple mismatches across extended HLA loci in real time. Methods We developed GRIMM-II (GRaph IMputation and Matching, version II), which comprises two novel algorithms: ML-GRIM (Multi-Locus GRIM) for HLA imputation across multiple loci, and ML-GRMA (Multi-Locus GRMA) for real-time donor-patient matching with up to three mismatches. Both algorithms employ a two-stage approach that combines efficient candidate reduction through graph-theoretic frameworks with detailed genotype comparison. ML-GRIM partitions genotypes into class I (HLA-A, B, C) and class II (remaining loci) components, enabling memory-efficient storage and rapid candidate identification. ML-GRMA searches a pre-imputed donor graph composed of donor genotypes and their sub-components, then computes asymmetric graft-versus-host (GvH) and host-versus-graft (HvG) mismatch probabilities to provide clinically relevant compatibility assessments. Both imputation and matching tools are available as a web application at https://grimmard.math.biu.ac.il/ and through GitHub repositories at https://github.com/nmdp-bioinformatics/py-graph-imputation (imputation) and https://github.com/nmdp-bioinformatics/py-graph-match (matching).
Results We validated ML-GRMA and ML-GRIM using the WMDA3 (World Marrow Donor Association) validation dataset, successfully reproducing all previously reported matches while identifying numerous additional candidate donors not detected by previous algorithms. Further validation of ML-GRMA using 3,000 patients with artificially introduced mismatches (0-3 allele substitutions) demonstrated 100% sensitivity and specificity in identifying matching donors at expected mismatch levels. We validated ML-GRIM using simulated nine-locus typings derived from 8,078,224 US donors in the NMDP registry. The algorithm successfully imputed genotypes across variable numbers of typed loci while incorporating multiethnic haplotype frequencies. The algorithm achieved real-time performance with typical imputation times under one second and matching times of 1-13 seconds per patient for up to three mismatches, even when searching databases exceeding 8 million donors. Notably, ML-GRMA identified substantially more potentially suitable donors than traditional algorithms by accounting for the biological reality that GvH and HvG mismatches often differ, particularly for donors homozygous at specific loci. To evaluate ML-GRIM performance with low-resolution typing, we tested it on simulated 3-locus typings from the same population. The resulting imputation accuracy correlated with the mutual information between typed loci and complete genotypes. Conclusions GRIMM-II provides a scalable, memory-efficient solution for nine-locus HLA imputation and real-time identification of donors with up to three mismatches. The graph-based framework supports dynamic registry updates and can readily accommodate additional HLA loci and matching criteria as clinical knowledge evolves.
By expanding the pool of acceptable donors while maintaining computational efficiency, GRIMM-II addresses a critical need in contemporary transplantation practice, particularly for patients from underrepresented ethnic minorities who face lower probabilities of finding perfectly matched donors.
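The asymmetry between GvH and HvG mismatches is easiest to see on fully typed genotypes. The sketch below counts, per locus, recipient alleles absent from the donor (GvH direction) and donor alleles absent from the recipient (HvG direction). It is a deterministic simplification with hypothetical function names and example typings; ML-GRMA itself computes mismatch probabilities over imputed genotypes, which is not reproduced here.

```python
def directional_mismatches(donor, recipient):
    """Return (gvh, hvg) mismatch counts summed over shared loci.

    donor and recipient map a locus name to a pair of alleles.
    GvH: recipient alleles the donor lacks (the graft may react to them).
    HvG: donor alleles the recipient lacks (the host may react to them).
    """
    gvh = hvg = 0
    for locus in donor.keys() & recipient.keys():
        donor_alleles = set(donor[locus])
        recipient_alleles = set(recipient[locus])
        gvh += len(recipient_alleles - donor_alleles)
        hvg += len(donor_alleles - recipient_alleles)
    return gvh, hvg

# Hypothetical typings: the donor is homozygous at HLA-A
donor = {"A": ("A*01:01", "A*01:01"), "B": ("B*07:02", "B*08:01")}
recipient = {"A": ("A*01:01", "A*02:01"), "B": ("B*07:02", "B*08:01")}
gvh, hvg = directional_mismatches(donor, recipient)
```

With a donor homozygous at HLA-A, the pair has one mismatch in the GvH direction and none in the HvG direction, which is the kind of case a single symmetric mismatch count cannot distinguish.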
Lu, Y.; Pan, M.; Jamwal, V.; Locop, J.; Ruparelia, A. A.; Currie, P. D.
Quantitative histological analysis of skeletal muscle morphometry provides critical insights into muscle physiology but remains labor-intensive and technically demanding. While recent developments in machine-learning-based image segmentation techniques have facilitated large-scale tissue analysis, existing tools that automate muscle morphometry analysis are largely tailored to mammalian models, with limited applicability to teleosts. Moreover, there is a lack of effective tools for visualizing spatial organization and morphometric variability of teleost muscle fibers, a feature that is important for understanding hyperplastic muscle growth dynamics in teleosts. In this study, we show that cytoplasmic staining combined with deep learning-based cell segmentation offers a robust and accurate approach for automated muscle morphometry analysis in developing zebrafish. We also introduce a FIJI2 plugin, implemented in Jython, that streamlines both morphometric analysis and visualization. This tool accommodates shallow and deep learning-based segmentation techniques and incorporates novel quantification and visualization methods suited to teleost-specific muscle features, including mosaic hyperplasia dynamics. The plugin features an intuitive graphical user interface and is designed for flexibility, with minimal constraints regarding species, image quality, or staining protocol. Its modular architecture allows it to be used as a baseline for automated muscle morphometry analysis, while permitting integration with other tools and workflows.
JIA, S.; Lysenko, A.; Boroevich, K. A.; Sharma, A.; Tsunoda, T.
Prognostic stratification in multiple myeloma (MM) relies on staging systems that assign patients to fixed categories at diagnosis and discard the temporal information that accumulates during treatment. We developed a dynamic multimodal framework that predicts residual overall survival using observation windows ranging from 1 to 18 months post-diagnosis. The model integrates DeepInsight-transformed gene expression representation, longitudinal laboratory measurement trajectories across 10 analytes, and treatment history for three drug classes through an adaptive fusion mechanism that accounts for missing clinical observations. On the MMRF CoMMpass cohort (n = 752), five-fold cross-validation yielded a concordance index (C-index) of 0.773 ± 0.024 and a time-dependent AUC at a 1-year prediction horizon (tdAUC1yr) of 0.789 ± 0.021, outperforming all evaluated baseline methods including DeepSurv (0.633 ± 0.095) and random survival forests (0.636 ± 0.024) on matched cross-validation splits. Modality ablation identified longitudinal laboratory measurements as the strongest individual contributor (C-index 0.693); the DeepInsight spatial encoding of gene expression yielded higher discrimination than a multilayer perceptron (MLP) baseline operating on the same features (0.624 vs. 0.596). Kaplan-Meier analysis showed significant prognostic group separation at all primary landmarks (log-rank p < 0.001; hazard ratios 3.46-3.93). A distilled student model retaining only the DeepInsight representation and five baseline clinical features achieved C-index 0.672 and tdAUC1yr 0.740 on an independent microarray cohort (GSE24080, n = 507) without retraining. Interpretability analysis identified prognostic associations consistent with established myeloma biology, including ubiquitin-proteasome pathway genes, endoplasmic reticulum stress markers, and Interferon Alpha Response pathway enrichment.
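For readers unfamiliar with the concordance index reported above, the following is a minimal sketch of Harrell's C-index for right-censored data. It is not the authors' evaluation code; the function name `concordance_index`, the toy follow-up times, and the convention that a higher risk score means shorter expected survival are assumptions for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index via an O(n^2) pairwise comparison (illustrative).

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk scores; higher is assumed to mean shorter survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier event has the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, one censored.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.1, 0.5]
print(concordance_index(times, events, risks))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the 0.773 figure quoted in the abstract.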
Hu, K.; Rosenberg, A. F.; Song, Y.; Fan, C.-H.; Peng, Z.; Gao, M.; Chong, Z.
V(D)J recombination generates antigen receptor diversity in developing B and T cells. Long-read transcriptome technologies (e.g., PacBio Iso-Seq, Nanopore RNA/cDNA) capture full-length transcripts and thus resolve V(D)J events more accurately than short-read platforms. However, existing short-read tools are not applicable to or optimized for long-read data. We developed VDJcraft, the first integrated pipeline designed for V(D)J recombination analysis using long-read transcriptome sequencing data. The workflow uses a two-pass alignment strategy: global alignment to the GENCODE reference with minimap2, followed by local realignment and annotation using the international ImMunoGeneTics information system (IMGT). A customized module enhances D-gene detection sensitivity and positional precision. Sequencing errors are reduced through consensus-based correction toward the predominant subclass. Antigen-binding regions are annotated using IMGT-defined motifs to characterize CDRs and binding site composition. VDJcraft was validated on simulated and Human Genome Structural Variation Consortium (HGSVC) datasets and applied to disease datasets. It accurately recovered full-length V(D)J-C sequences and outperformed existing methods in gene detection and recombination accuracy. Long-read calls also showed significantly higher concordance with high-confidence short-read calls (Mann-Whitney U test, p = 1.55 × 10⁻⁴). Additionally, we identified 31 putative novel gene subclasses absent from the IMGT database from HGSVC datasets. Analyses of longitudinal blood samples from a COVID-19 patient revealed distinct V(D)J recombination patterns and segment enrichment, characterized by increased IGHV1-2 usage, enrichment of the IGHV3-7/IGHD6-9/IGHJ5_02 rearranged clonotype, and a transient peak in IgG2 levels at day 4 followed by a gradual return to baseline. 
In conclusion, VDJcraft provides a robust framework for long-read V(D)J characterization and enables the discovery of disease-associated immune signatures.
Manna, I. I. A.; Wagle, U.; Balaji, B.; Lath, V.; Sampathila, N.; Sirur, F. M.; Upadya, S.
Background Snakebite envenoming is a significant global health crisis that has long been neglected as a health priority. It is a major burden for rural communities in low- and middle-income countries, and India accounts for the largest proportion of snakebite deaths globally. Timely identification of venomous snakebite and its syndromic pattern is essential for effective administration of antivenom and supportive treatment. Expert identification of snake species and syndromes is not always available in peripheral healthcare settings. This leads to delays, unnecessary referrals, or improper treatment choices. Additionally, diverse snake species distribution and venom variations across regions pose challenges. AI-powered image classification methods can help overcome these barriers. We propose a clinically oriented deep learning pipeline for binary classification of venomous and non-venomous snake species of India using real-world imagery data. This pipeline would serve as a baseline step towards aiding snakebite management at peripheral healthcare setups with scarce resources. Methods The selected dataset consisted of 20 medically important Indian species. MobileViT-S, ConvNeXt-Tiny, EfficientNet-V2-S and ResNeXt-50 (32x4d) were trained under the same conditions so that results could be compared. Model interpretability was evaluated using Grad-CAM++ to ensure that classification was based not on the background but on features such as head shape and body stripes. For reliable implementation, we connected the model to a web interface with human-in-the-loop expert verification. Experts can confirm or override predictions in real time. Results Among the evaluated architectures, ResNeXt-50 (32x4d) showed the most reliable and consistent performance in classifying venomous and non-venomous snakes. It achieved the highest test accuracy, sensitivity, specificity, and F1-score. The model also had strong discriminative ability, with a ROC-AUC of 0.9950 and PR-AUC of 0.9959. 
These results indicate dependable performance in safety-critical screening situations. Grad-CAM++ visualizations confirmed that predictions were based on anatomically relevant features, especially in the head and body contour areas. This supports model interpretability and reduces background bias. Conclusions Although the dataset size and single-institution source limit how widely the results can be applied, the proposed framework shows that it is possible to create a clinically oriented, ready-to-use deep learning system for snakebite triage support. This system is intended as a scalable tool to help rural healthcare workers, emergency responders, and telemedicine platforms in areas where snakebites are common. Author Summary Snakebite is a major public health concern that disproportionately affects the rural population. Delays in identifying whether a snake is venomous often lead to delayed treatment, unnecessary use of antivenom, or inappropriate referrals. In many rural settings, access to expert snake identification is limited. To address this gap, we have developed an artificial intelligence (AI)-based image classification system that sorts snakes into two clinically relevant categories: venomous or non-venomous. Unlike many previous studies that focused on ideal, high-quality wildlife images, our model was trained using real-world photographs captured in emergency situations, including images taken by patients and field responders under variable lighting and background conditions. This approach improves the model's relevance to practical healthcare settings. The system achieved high accuracy and was further strengthened by visual interpretability tools and expert verification to ensure reliability. By combining AI-assisted classification with human oversight, this work provides a scalable decision-support tool that may improve early triage, rational antivenom use, and surveillance in snakebite-endemic regions.
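The sensitivity, specificity, and F1-score named in the results can be derived from a binary confusion matrix, as in the minimal sketch below. The function name `binary_metrics` and the counts are hypothetical and are not taken from the study; "venomous" is assumed to be the positive class.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute screening metrics from confusion-matrix counts (illustrative).

    Assumption: the positive class is "venomous".
    """
    sensitivity = tp / (tp + fn)          # venomous snakes correctly flagged
    specificity = tn / (tn + fp)          # non-venomous snakes correctly cleared
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    return sensitivity, specificity, f1


# Hypothetical counts, not results from the abstract.
sens, spec, f1 = binary_metrics(tp=90, fn=10, tn=80, fp=20)
print(sens, spec)  # 0.9 0.8
```

In a triage setting, sensitivity is usually the metric to watch most closely, since a missed venomous snake (a false negative) is the costlier error.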